A GREEDY ALGORITHM FOR THE MINIMIZATION OF A RATIO
OF SAME-INDEX ELEMENT SUMS FROM TWO POSITIVE
ARRAYS


ALEXANDER LOZOVSKIY* 

Abstract. Consider two ordered arrays of positive real numbers of equal size. The problem is to
find a set of indices of given size such that the ratio of the sums of the array elements with those
indices is minimized. In this work, in order to mitigate the exponential complexity of the brute force
search, we present a greedy algorithm for finding such an index set. The main result
of the paper is a theorem stating that the algorithm eliminates from the candidates all index sets
that do not contain any elements from the greedily selected set. We additionally prove exactness for
the particular case of a ratio of sums of only two elements.
1. Problem statement and motivation example. Consider two finite
sequences (arrays) $a = \{a_1, a_2, \dots, a_N\}$ and $b = \{b_1, b_2, \dots, b_N\}$ such that $a_i, b_i \in \mathbb{R}$
and $a_i, b_i > 0$ for all $1 \le i \le N$. Let a positive integer $n$ satisfy $n < N$ and let
$I = \{i_1, i_2, \dots, i_n\}$ denote a set of distinct indices such that $1 \le i_k \le N$ for all
$1 \le k \le n$. We distinguish between different index sets by putting a superscript
above: $I^{(1)}, I^{(2)}$, etc. For example,

\[ I^{(s)} = \{i_1^{(s)}, i_2^{(s)}, \dots, i_n^{(s)}\}. \]

Let $\mathcal{I} = \{I^{(1)}, I^{(2)}, \dots, I^{(M)}\}$ collect all possible such sets. Clearly, the cardinality
of $\mathcal{I}$ is

\[ |\mathcal{I}| = M = \frac{N!}{n!\,(N-n)!}. \]

We wish to find an index set $I^{(m)}$ in $\mathcal{I}$ such that the fraction

\[ \frac{\sum_{k=1}^{n} a_{i_k^{(m)}}}{\sum_{k=1}^{n} b_{i_k^{(m)}}} \]

is minimized. In other words,

\[ m = \arg\min_{1 \le s \le M} \frac{a_{i_1^{(s)}} + a_{i_2^{(s)}} + \dots + a_{i_n^{(s)}}}{b_{i_1^{(s)}} + b_{i_2^{(s)}} + \dots + b_{i_n^{(s)}}}. \]

For example, let $N = 4$ and $n = 2$ and

\[ a = \{3, 2, 5, 7\}, \quad b = \{6, 2, 2, 8\}. \]

The index set $I = \{1, 2\}$, corresponding to elements $a_1 = 3$, $a_2 = 2$ and $b_1 = 6$, $b_2 = 2$,
yields the least fraction of all $M = 6$ possible ones:

\[ \frac{\sum_{k=1}^{2} a_k}{\sum_{k=1}^{2} b_k} = \frac{3+2}{6+2} = \frac{5}{8}. \]
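The example can be checked by exhaustive search. The following is a minimal sketch, not part of the original paper; the function name `brute_force_min_ratio` is an illustrative assumption, and exact rational arithmetic is used only to avoid floating-point ties.

```python
from fractions import Fraction
from itertools import combinations

def brute_force_min_ratio(a, b, n):
    """Try every n-element index set; return (best set, best ratio), 1-based indices."""
    best_set, best_ratio = None, None
    for idx in combinations(range(len(a)), n):
        ratio = Fraction(sum(a[i] for i in idx), sum(b[i] for i in idx))
        if best_ratio is None or ratio < best_ratio:
            best_set, best_ratio = idx, ratio
    return tuple(i + 1 for i in best_set), best_ratio

a = [3, 2, 5, 7]
b = [6, 2, 2, 8]
best_set, best_ratio = brute_force_min_ratio(a, b, 2)
print(best_set, best_ratio)  # (1, 2) 5/8
```

For the arrays $a = \{3, 2, 5, 7\}$, $b = \{6, 2, 2, 8\}$ the search visits all six index pairs and confirms that $\{1, 2\}$ attains the least fraction.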
Remark. Note that the indices for the numerator and the denominator of the
fractions of interest are the same. This is a core constraint of the problem that
makes it non-trivial, as opposed to the case when elements from $a$ and $b$ can be chosen
independently. In the latter case, the trivial sequential search for the minimal sum of
elements from $a$ and the maximal sum of elements from $b$ results in complexity
$O(nN + n^2)$.
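To illustrate the unconstrained case of the remark, here is a small sketch in which numerator and denominator indices are picked independently. The function name is hypothetical, and `heapq.nsmallest` / `heapq.nlargest` stand in for the sequential search mentioned in the text, so the cost profile is not identical.

```python
import heapq

def independent_min_ratio(a, b, n):
    """Minimal ratio when numerator and denominator indices are chosen independently."""
    smallest_a = heapq.nsmallest(n, a)  # n smallest numerator terms
    largest_b = heapq.nlargest(n, b)    # n largest denominator terms
    return sum(smallest_a) / sum(largest_b)

# For a = {3, 2, 5, 7}, b = {6, 2, 2, 8}, n = 2 this gives (2 + 3) / (8 + 6) = 5/14,
# which is below the same-index optimum 5/8.
print(independent_min_ratio([3, 2, 5, 7], [6, 2, 2, 8], 2))
```

The gap between the two values shows why the same-index constraint cannot be handled by two independent selections.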

1.1. Motivation example. Although this is an independent combinatorial
optimization problem in its own right, we present one non-trivial motivating example for
it. Such a minimization may stem from the reduced-order numerical modeling of
nonlinear partial differential equations. Consider the problem of the best approximation
in 2-norm of a vector $f \in \mathbb{R}^N$ by a subspace $\mathbb{S} \subset \mathbb{R}^N$ such that $\mathbb{S} = \mathrm{span}(U)$, where $U$
consists of $L < N$ orthonormal vectors $u_j \in \mathbb{R}^N$. Obviously, the best approximation
$\hat f \in \mathbb{S}$ in 2-norm is the projection via least squares:

\[ \hat f = U U^T f. \]

In computational applications, the vector $f$ can be a time-dependent term arising from
the space-time discretization of a nonlinear term in a partial differential equation
[1]. The evaluation of this term at every time step or Newton iteration may be time
consuming and conflicts with the goal of reduced-order modeling, which is to provide
a computationally cheap scheme. Therefore, an approximation of $f$ other than the
projection $\hat f$ is required, since the evaluation of $\hat f$ implies the evaluation of all $N$ entries
of $f$.

One strategy, called hyper-reduction, is to employ an incomplete evaluation of the
vector $f$ when building its approximation. A method known as Gappy POD
finds the approximation based on only a few evaluated entries of $f$. If we choose a
limited number $n \ll N$ of entries for evaluation, with $n \ge L$, this creates a permutation
(selection) matrix $P$ of size $N \times n$. Let $q = Uc$ denote such an approximation of $f$ in
the subspace $\mathbb{S}$. Then

\[ \|f - q\|^2 = \|f - \hat f + \hat f - q\|^2 = \|f - \hat f\|^2 + \|\hat f - q\|^2 \]

due to the Pythagorean law, where $\hat f = U U^T f$ is the projection of $f$ onto $\mathbb{S}$. The minimization of $\|f - q\|$ is then equivalent to the
minimization of $\|\hat f - q\|$. The Gappy POD method chooses the coefficient vector $c$ by
solving the least squares problem

\[ c = \arg\min_{\alpha \in \mathbb{R}^L} \|P^T f - P^T U \alpha\| \]

and so

\[ c = (U^T P P^T U)^{-1} U^T P P^T f. \]

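Suppose all instances (snapshots) of $f$ in the application of interest do not leave some
subspace $\mathbb{Q} \subset \mathbb{R}^N$, and $\mathbb{Q} = \mathrm{span}(U, \tilde U)$, where $\tilde U$ also consists of orthonormal
columns and $U \cap \tilde U = \{0\}$.

This gives

\[ \|\hat f - q\| = \left\| U^T f - (U^T P P^T U)^{-1} U^T P P^T (U U^T f + \tilde U \tilde U^T f) \right\| = \left\| (U^T P P^T U)^{-1} U^T P P^T \tilde U \tilde U^T f \right\|. \qquad (1.1) \]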
Since the general approximation strategy does not depend on $f$ but only on the
subspace it resides in, we can isolate $f$ via the inequality

\[ \|\hat f - q\| \le \left\| (U^T P P^T U)^{-1} U^T P P^T \tilde U \right\| \, \|\tilde U^T f\| \]

and focus on constructing an index set for the entries of $f$, i.e. a matrix $P$, that
minimizes the norm of $(U^T P P^T U)^{-1} U^T P P^T \tilde U$. The direct minimization of this norm
may be challenging. Instead, we can try to minimize its upper bound:

\[ \left\| (U^T P P^T U)^{-1} U^T P P^T \tilde U \right\| \le \left\| (U^T P P^T U)^{-1} \right\|_F \, \|U^T P\|_F \, \|P^T \tilde U\|_F, \]

where $\|\cdot\|_F$ denotes the Frobenius norm of a matrix.

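Consider the case when the subspace $\mathbb{S}$ is one-dimensional, i.e. $U = u$ is a single unit
vector. Then the aforementioned upper bound turns into

\[ \frac{\|P^T \tilde U\|_F}{\|P^T u\|}. \]

Clearly, if this upper bound is squared and we denote $b_i = u_i^2$ and let $a_i$ be the squared
2-norm of the $i$-th row of $\tilde U$, then
we arrive at the problem statement at the beginning of this section: find an index
set $\{i_1, i_2, \dots, i_n\}$, i.e. a selection matrix $P$ of size $N \times n$, such that

\[ \frac{\|P^T \tilde U\|_F^2}{\|P^T u\|^2} = \frac{a_{i_1} + a_{i_2} + \dots + a_{i_n}}{b_{i_1} + b_{i_2} + \dots + b_{i_n}} \]

is the smallest fraction of all.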
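As a hedged illustration of this reduction, the sketch below builds the two positive arrays from a unit vector and a complementary orthonormal basis. The data, the name `U_tilde` for the complementary basis, and the helper `squared_bound` are assumptions made up for this example; they do not come from the paper.

```python
import math

# Hypothetical data: u is a unit vector; U_tilde has a single orthonormal
# column that is orthogonal to u (stored row by row).
u = [0.2, 0.4, 0.4, 0.8]
U_tilde = [[0.8], [0.4], [-0.4], [-0.2]]

a = [sum(x * x for x in row) for row in U_tilde]  # squared row norms of U_tilde
b = [x * x for x in u]                            # squared entries of u

def squared_bound(index_set):
    """Squared upper bound for a selection matrix picking the given rows (0-based)."""
    return sum(a[i] for i in index_set) / sum(b[i] for i in index_set)

print(squared_bound([2, 3]))  # close to 0.25, up to rounding
```

Selecting rows with large entries of $u$ and small rows of the complementary basis keeps the bound small, which is exactly the ratio minimization problem over the arrays `a` and `b`.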

We now present a cheap greedy method for solving the minimization problem 
involving general positive arrays. 

2. The greedy method. Clearly, only when $n = 1$ or $n = N - 1$ does the brute force
search procedure, which tries the fractions corresponding to all $M$ possible index sets,
have cost $O(N)$, i.e. scale linearly with the size of the arrays $N$. For other cases,
the cost is superlinear. Moreover, if the number of selected array entries $n$ grows with the array
size as $n = O(N)$, the search cost increases, according to Stirling's approximation,
as


O 



where $1 < c \le 2$, rendering the brute force search impractical.
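The growth of the brute force search space can be made concrete with a short computation. This sketch is an illustration only: the function name is hypothetical, and the estimate used for comparison is the standard leading-order Stirling formula for the central binomial coefficient, $\binom{N}{N/2} \approx 2^N / \sqrt{\pi N / 2}$, which is an assumption about the worst case $n = N/2$ rather than a formula taken from the paper.

```python
import math

def brute_force_set_count(N, n):
    """Number of n-element index sets drawn from N indices."""
    return math.comb(N, n)

for N in (10, 20, 40):
    n = N // 2
    exact = brute_force_set_count(N, n)
    # Leading-order Stirling estimate for the central binomial coefficient.
    estimate = 2 ** N / math.sqrt(math.pi * N / 2)
    print(N, exact, round(estimate))
```

Already for $N = 40$ and $n = 20$ the number of candidate sets exceeds $10^{11}$, while a greedy scan touches on the order of $nN$ fractions.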

A greedy algorithm is a well-known heuristic method of combinatorial
optimization [9]. It is locally optimal, meaning that it chooses the best solution only within a
subproblem at the current iteration and never reconsiders its choices. These two
features make it computationally affordable.

Here we propose a greedy algorithm that has cost $O(N)$ per iteration. It
is presented below as Algorithm 1.

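Algorithm 1.

Input: arrays $a = \{a_1, a_2, \dots, a_N\} > 0$ and $b = \{b_1, b_2, \dots, b_N\} > 0$, a positive integer
$n < N$.

  $j := \arg\min_{1 \le k \le N} \dfrac{a_k}{b_k}$

  $I := \{j\}$

  for $m = 2 : n$ do

    $j := \arg\min_{k \notin I} \dfrac{a_{i_1} + \dots + a_{i_{m-1}} + a_k}{b_{i_1} + \dots + b_{i_{m-1}} + b_k}$, where $I = \{i_1, \dots, i_{m-1}\}$

    $I := I \cup \{j\}$

  end for

Output: index set $I$.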

The idea of the algorithm is simple. At iteration $m$, the algorithm sequentially
searches for a single index that minimizes the fraction with $m$
numbers in both the denominator and the numerator, sharing the same positions, in which
the other $m - 1$ indices have been selected at previous iterations. Due to this linear
search, the cost of each iteration is proportional to $N$, which gives the total cost
$O(nN + n^2)$ for the entire algorithm. Another benefit is that every
fraction evaluation requires only two additions, one in the numerator and one in the
denominator. This is in contrast to the brute force search, in which each evaluation
requires up to $2(n - 1)$ additions.
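The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `greedy_index_set` is hypothetical, indices are returned 1-based to match the text, and ties in the argmin are broken by the lowest index.

```python
def greedy_index_set(a, b, n):
    """Greedy selection of n indices (returned 1-based, in order of selection)."""
    chosen = []
    used = [False] * len(a)
    sum_a = 0  # running numerator over the chosen indices
    sum_b = 0  # running denominator over the chosen indices
    for _ in range(n):
        best_k, best_ratio = None, None
        for k in range(len(a)):  # one linear scan per iteration
            if used[k]:
                continue
            ratio = (sum_a + a[k]) / (sum_b + b[k])  # two additions per candidate
            if best_ratio is None or ratio < best_ratio:
                best_k, best_ratio = k, ratio
        used[best_k] = True
        chosen.append(best_k + 1)
        sum_a += a[best_k]
        sum_b += b[best_k]
    return chosen

print(greedy_index_set([3, 2, 5, 7], [6, 2, 2, 8], 2))  # [1, 2]
```

Keeping the running sums `sum_a` and `sum_b` is what limits each candidate evaluation to two additions.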

It should be noted that the ratio minimization problem may have more than one 
solution. The same is true for the greedy method, due to non-uniqueness, in general, 
of the output of the argmin operation at every iteration. The result of this work is 
general and applies to all possible solutions. 

The main result of the paper is the first statement of Theorem 3.3.

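3. The main result. Lemma 3.1. Let $i_1, i_2, \dots, i_n$ denote the indices chosen by
the greedy method (Algorithm 1) in the order of iterations. Let

\[ q_k = \frac{a_{i_1} + a_{i_2} + \dots + a_{i_k}}{b_{i_1} + b_{i_2} + \dots + b_{i_k}} \]

for $1 \le k \le n$. Then the sequence $\{q_k\}$ is non-decreasing for $1 \le k \le n$.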

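Proof. We shall prove this lemma by induction. Let us compare the first two
elements of the sequence. By definition,

\[ q_1 = \frac{a_{i_1}}{b_{i_1}}, \quad q_2 = \frac{a_{i_1} + a_{i_2}}{b_{i_1} + b_{i_2}}. \]

Since, according to the greedy algorithm,

\[ \frac{a_{i_1}}{b_{i_1}} \le \frac{a_{i_2}}{b_{i_2}}, \]

we obtain

\[ a_{i_1} b_{i_2} \le a_{i_2} b_{i_1}, \]

which is equivalent to

\[ a_{i_1} b_{i_1} + a_{i_1} b_{i_2} \le a_{i_1} b_{i_1} + a_{i_2} b_{i_1}, \]

or

\[ \frac{a_{i_1}}{b_{i_1}} \le \frac{a_{i_1} + a_{i_2}}{b_{i_1} + b_{i_2}}. \]
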
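The base of induction is complete. Now, let us assume the inequality $q_{m-1} \le q_m$
holds for some index $m \ge 2$, and we wish to show that $q_m \le q_{m+1}$. From

\[ \frac{a_{i_1} + a_{i_2} + \dots + a_{i_{m-1}}}{b_{i_1} + b_{i_2} + \dots + b_{i_{m-1}}} \le \frac{a_{i_1} + a_{i_2} + \dots + a_{i_m}}{b_{i_1} + b_{i_2} + \dots + b_{i_m}} \qquad (3.1) \]

we equivalently obtain

\[ b_{i_m}(a_{i_1} + a_{i_2} + \dots + a_{i_{m-1}}) \le a_{i_m}(b_{i_1} + b_{i_2} + \dots + b_{i_{m-1}}). \qquad (3.2) \]

Also, due to the greedy index selection at iteration $m$, we have

\[ \frac{a_{i_1} + a_{i_2} + \dots + a_{i_{m-1}} + a_{i_m}}{b_{i_1} + b_{i_2} + \dots + b_{i_{m-1}} + b_{i_m}} \le \frac{a_{i_1} + a_{i_2} + \dots + a_{i_{m-1}} + a_{i_{m+1}}}{b_{i_1} + b_{i_2} + \dots + b_{i_{m-1}} + b_{i_{m+1}}}, \]

which is equivalent to

\[ b_{i_{m+1}}(a_{i_1} + a_{i_2} + \dots + a_{i_{m-1}} + a_{i_m}) - a_{i_{m+1}}(b_{i_1} + b_{i_2} + \dots + b_{i_{m-1}} + b_{i_m}) \le \]
\[ \le b_{i_m}(a_{i_1} + a_{i_2} + \dots + a_{i_{m-1}}) - a_{i_m}(b_{i_1} + b_{i_2} + \dots + b_{i_{m-1}}). \]

The right-hand side of the above inequality is non-positive, due to (3.2), and therefore

\[ b_{i_{m+1}}(a_{i_1} + a_{i_2} + \dots + a_{i_m}) \le a_{i_{m+1}}(b_{i_1} + b_{i_2} + \dots + b_{i_m}). \]

Analogously to the way (3.2) implies (3.1), we conclude from the above inequality
that

\[ \frac{a_{i_1} + a_{i_2} + \dots + a_{i_m}}{b_{i_1} + b_{i_2} + \dots + b_{i_m}} \le \frac{a_{i_1} + a_{i_2} + \dots + a_{i_m} + a_{i_{m+1}}}{b_{i_1} + b_{i_2} + \dots + b_{i_m} + b_{i_{m+1}}}. \]

□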
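The monotonicity claimed by Lemma 3.1 can be observed numerically. The sketch below is an illustration, not a proof; the function name `greedy_ratios` is hypothetical, and the arrays are the ones used in the closing example of the paper.

```python
from fractions import Fraction

def greedy_ratios(a, b, n):
    """Return the exact cumulative ratios q_1, ..., q_n along the greedy selection."""
    used, sum_a, sum_b, ratios = set(), 0, 0, []
    for _ in range(n):
        k = min((k for k in range(len(a)) if k not in used),
                key=lambda k: Fraction(sum_a + a[k], sum_b + b[k]))
        used.add(k)
        sum_a += a[k]
        sum_b += b[k]
        ratios.append(Fraction(sum_a, sum_b))
    return ratios

q = greedy_ratios([1, 3, 6, 4], [10, 3, 12, 6], 3)
print(q)  # [Fraction(1, 10), Fraction(4, 13), Fraction(2, 5)]
print(all(x <= y for x, y in zip(q, q[1:])))  # True
```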

Lemma 3.2. Consider two arrays of equal size $n$ consisting of positive real
numbers $\{x_i\}$, $\{y_i\}$. Then the array $\{z_i\}$ defined as

\[ z_i = x_i(y_1 + y_2 + \dots + y_n) - y_i(x_1 + x_2 + \dots + x_n) \]

for all $1 \le i \le n$ contains at least one non-positive element. If there are no strictly
negative elements in $\{z_i\}$, then

\[ \frac{x_i}{y_i} = C \]

for all $1 \le i \le n$, where $C$ is independent of $i$.

Proof. Let us add all the elements of the array $\{z_i\}$ as shown:

\[ z_1 + z_2 + \dots + z_n = \sum_{i=1}^{n} x_i \sum_{k=1}^{n} y_k - \sum_{i=1}^{n} y_i \sum_{k=1}^{n} x_k = 0. \]

Hence, at least one of the $z_i$ has to be non-positive in order for this identity to hold.

If there are no strictly negative $z_i$, then from $z_1 + z_2 + \dots + z_n = 0$ it follows that
$z_i = 0$ for any $1 \le i \le n$, and the second statement of the lemma follows immediately.

□
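A short numerical check of Lemma 3.2, given as a sketch with a hypothetical helper `z_values`: the entries always sum to zero, so at least one is non-positive, and they all vanish when the two arrays are proportional.

```python
def z_values(x, y):
    """z_i = x_i * sum(y) - y_i * sum(x) for two equal-length positive arrays."""
    sx, sy = sum(x), sum(y)
    return [xi * sy - yi * sx for xi, yi in zip(x, y)]

z = z_values([3, 2, 5, 7], [6, 2, 2, 8])
print(z, sum(z))                       # [-48, 2, 56, -10] 0
print(z_values([1, 2, 3], [2, 4, 6]))  # [0, 0, 0]
```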

Theorem 3.3. Consider two arrays $\{a_i\}$, $\{b_i\}$ of equal size $N$ consisting of
positive real numbers and let $n$ be an integer such that $2 \le n < N$. Let $I_n$ denote an
index set (one of) with $n$ indices constructed by the greedy method (Algorithm 1) and $J_n$ the actual
minimizing index set (one of) with $n$ indices. If at least one of the equalities

\[ \frac{a_{j_1}}{b_{j_1}} = \frac{a_{j_2}}{b_{j_2}} = \dots = \frac{a_{j_n}}{b_{j_n}} \]

is false, then $J_n \cap I_n \neq \emptyset$. If all these equalities are true, then the greedy algorithm is
exact, meaning it provides the actual minimizing index set (one of).

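Proof.

Proof of the first statement. The statement is obvious for the trivial case
$n > N/2$, since any two index sets of size $n$ must then intersect. So below we consider the case when $n \le N/2$. Denote $S = \{1, 2, \dots, N\}$ and

\[ p = a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}} + a_{i_{n-1}}, \]

\[ q = b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}} + b_{i_{n-1}}. \]

If $n = 2$, everywhere below the terms $a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}}$ and $b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}}$ are
treated as zeros and $I_0 = \emptyset$.

We shall prove this theorem by contradiction, that is, we assume that $J_n$ and $I_n$
do not have a single index in common. Then

\[ \frac{p + a_k}{q + b_k} \ge \frac{a_{j_1} + a_{j_2} + \dots + a_{j_n}}{b_{j_1} + b_{j_2} + \dots + b_{j_n}} \qquad (3.3) \]

for any $k \in S \setminus I_{n-1}$, or

\[ q \le p \, \frac{b_{j_1} + b_{j_2} + \dots + b_{j_n}}{a_{j_1} + a_{j_2} + \dots + a_{j_n}} + \frac{a_k(b_{j_1} + b_{j_2} + \dots + b_{j_n}) - b_k(a_{j_1} + a_{j_2} + \dots + a_{j_n})}{a_{j_1} + a_{j_2} + \dots + a_{j_n}} \]

for any $k \in S \setminus I_{n-1}$. In addition to this inequality, we have

\[ \frac{p}{q} \le \frac{a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}} + a_l}{b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}} + b_l} \]

for any $l \in S \setminus I_{n-2}$, due to the greedy index selection at iteration $n - 1$. Then

\[ q \ge \frac{b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}} + b_l}{a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}} + a_l} \, p \]

for any $l \in S \setminus I_{n-2}$. From here, it follows that

\[ \frac{b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}} + b_l}{a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}} + a_l} \, p \le \]
\[ \le \frac{b_{j_1} + b_{j_2} + \dots + b_{j_n}}{a_{j_1} + a_{j_2} + \dots + a_{j_n}} \, p + \frac{a_k(b_{j_1} + b_{j_2} + \dots + b_{j_n}) - b_k(a_{j_1} + a_{j_2} + \dots + a_{j_n})}{a_{j_1} + a_{j_2} + \dots + a_{j_n}} \]

for any pair $l, k \in S \setminus I_{n-1}$. The right-hand side of the above inequality is a linear
function of $p$. In this function, the slope is fixed, but the free term varies depending
on the index $k$. Since $J_n \subset S \setminus I_{n-1}$ and the ratios $a_{j}/b_{j}$, $j \in J_n$, are not all equal by assumption, we can employ Lemma 3.2 to conclude that at least
for one index $k$ from $S \setminus I_{n-1}$ the free term is strictly negative.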
Therefore, for this inequality to hold, it is necessary that

\[ \frac{b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}} + b_l}{a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}} + a_l} < \frac{b_{j_1} + b_{j_2} + \dots + b_{j_n}}{a_{j_1} + a_{j_2} + \dots + a_{j_n}} \]

for any $l \in S \setminus I_{n-1}$. In other words,

\[ (b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}})(a_{j_1} + a_{j_2} + \dots + a_{j_n}) - (b_{j_1} + b_{j_2} + \dots + b_{j_n})(a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}}) < \]
\[ < a_l(b_{j_1} + b_{j_2} + \dots + b_{j_n}) - b_l(a_{j_1} + a_{j_2} + \dots + a_{j_n}). \]

By accumulating this inequality over all $l \in J_n = \{j_1, j_2, \dots, j_n\}$, we end up with

\[ (b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}})(a_{j_1} + a_{j_2} + \dots + a_{j_n}) - (b_{j_1} + b_{j_2} + \dots + b_{j_n})(a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}}) < 0. \qquad (3.4) \]

If $n = 2$, that means $0 < 0$, and the proof by contradiction is complete. From here
on, we assume $n > 2$. Due to the greedy index selection at iteration $n - 2$, we have

\[ \frac{a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}}}{b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}}} \le \frac{a_{i_1} + a_{i_2} + \dots + a_{i_{n-3}} + a_l}{b_{i_1} + b_{i_2} + \dots + b_{i_{n-3}} + b_l} \]

for any $l \in S \setminus I_{n-3}$, or

\[ (a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}})(b_{i_1} + b_{i_2} + \dots + b_{i_{n-3}}) - (a_{i_1} + a_{i_2} + \dots + a_{i_{n-3}})(b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}}) \le \]
\[ \le a_l(b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}}) - b_l(a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}}). \]

By accumulating this inequality over all $l \in J_n$, we end up with

\[ n(a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}})(b_{i_1} + b_{i_2} + \dots + b_{i_{n-3}}) - n(a_{i_1} + a_{i_2} + \dots + a_{i_{n-3}})(b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}}) \le \]
\[ \le (b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}})(a_{j_1} + a_{j_2} + \dots + a_{j_n}) - (b_{j_1} + b_{j_2} + \dots + b_{j_n})(a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}}). \]

Thanks to (3.4), the right-hand side of the above inequality is strictly negative.
If $n = 3$, we again arrive at $0 < 0$, thus completing the proof. If $n > 3$, we get

\[ \frac{a_{i_1} + a_{i_2} + \dots + a_{i_{n-2}}}{b_{i_1} + b_{i_2} + \dots + b_{i_{n-2}}} < \frac{a_{i_1} + a_{i_2} + \dots + a_{i_{n-3}}}{b_{i_1} + b_{i_2} + \dots + b_{i_{n-3}}}. \]

According to Lemma 3.1, we obtain a contradiction. Therefore, the sets $I_n$ and $J_n$ have at
least one index in common.
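The first statement can also be probed empirically on small random inputs. The following sketch is a sanity check under stated assumptions, not a substitute for the proof: all function names are hypothetical, arrays hold random positive integers, and exact fractions are used so that ties among minimizers are detected reliably.

```python
import random
from fractions import Fraction
from itertools import combinations

def greedy_set(a, b, n):
    """Greedy index set (0-based), ties broken by the lowest index."""
    used, sa, sb = set(), 0, 0
    for _ in range(n):
        k = min((k for k in range(len(a)) if k not in used),
                key=lambda k: Fraction(sa + a[k], sb + b[k]))
        used.add(k)
        sa += a[k]
        sb += b[k]
    return used

def minimizers(a, b, n):
    """All n-element index sets attaining the exact minimal ratio."""
    ratio = lambda s: Fraction(sum(a[i] for i in s), sum(b[i] for i in s))
    sets = list(combinations(range(len(a)), n))
    best = min(ratio(s) for s in sets)
    return [set(s) for s in sets if ratio(s) == best]

def theorem_holds(a, b, n):
    """Check that every minimizer with unequal ratios meets the greedy set."""
    I = greedy_set(a, b, n)
    for J in minimizers(a, b, n):
        ratios = {Fraction(a[j], b[j]) for j in J}
        if len(ratios) > 1 and not (I & J):
            return False
    return True

random.seed(0)
trials = [([random.randint(1, 20) for _ in range(8)],
           [random.randint(1, 20) for _ in range(8)]) for _ in range(200)]
print(all(theorem_holds(a, b, n) for a, b in trials for n in (2, 3, 4)))
```

The case $n = 4$, $N = 8$ is included on purpose, since it lies outside the trivial regime $n > N/2$.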

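Proof of the second statement. Let

\[ \frac{a_{j_1}}{b_{j_1}} = \frac{a_{j_2}}{b_{j_2}} = \dots = \frac{a_{j_n}}{b_{j_n}} = C. \]

Since $J_n$ contains indices corresponding to the actual least fraction (one of), we have

\[ C \le \frac{a_{i_1} + a_{j_2} + a_{j_3} + \dots + a_{j_n}}{b_{i_1} + b_{j_2} + b_{j_3} + \dots + b_{j_n}} = \frac{a_{i_1} + C(b_{j_2} + b_{j_3} + \dots + b_{j_n})}{b_{i_1} + b_{j_2} + b_{j_3} + \dots + b_{j_n}}. \]

From here,

\[ \frac{a_{i_1}}{b_{i_1}} \ge C. \]

But since $\frac{a_{j_1}}{b_{j_1}} = C$, at the first iteration the greedy method will definitely choose such an
index $i_1$ that $\frac{a_{i_1}}{b_{i_1}} = C$.
Due to Lemma 3.1,

\[ \frac{a_{i_1} + a_{i_2}}{b_{i_1} + b_{i_2}} \ge C. \]

Once again, since

\[ \frac{a_{j_2}}{b_{j_2}} = C, \]

the second iteration of the greedy method will arrive at such an index set $I_2$ that
$\frac{a_{i_1} + a_{i_2}}{b_{i_1} + b_{i_2}} = C$. Continuing this way until we complete all $n$ iterations, we finally
arrive at

\[ \frac{a_{i_1} + a_{i_2} + \dots + a_{i_n}}{b_{i_1} + b_{i_2} + \dots + b_{i_n}} = C, \]

which is the value of the least fraction. □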

Proposition 3.4. Let the conditions of the first statement of Theorem 3.3
hold for $n = 2$. Then the greedy algorithm is exact, meaning it provides the actual
least fraction.

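Proof. Denote by $\{i_1, i_2\}$ the index set constructed by the greedy algorithm.
Suppose the index set $\{j_1, j_2\}$ corresponds to the actual least fraction $\frac{a_{j_1} + a_{j_2}}{b_{j_1} + b_{j_2}}$ of all possible
ones. From Theorem 3.3, the two sets share an index. If they share $i_1$, exactness is immediate, since the second greedy iteration selects the best partner for $i_1$. Otherwise one of the indices, say, $j_1$, must equal $i_2$.

We again shall use the method of contradiction. Assume $j_2 \ne i_1$.

We have

\[ \frac{a_{i_2} + a_{j_2}}{b_{i_2} + b_{j_2}} \le \frac{a_{i_1} + a_l}{b_{i_1} + b_l} \]

for any index $l$, as long as $l \ne i_1$. Therefore,

\[ (a_{i_2} + a_{j_2}) b_l - (b_{i_2} + b_{j_2}) a_l \le a_{i_1} b_{i_2} - a_{i_2} b_{i_1} + a_{i_1} b_{j_2} - a_{j_2} b_{i_1}. \]

Since, due to the greedy index selection at the first iteration of Algorithm 1, we have

\[ a_{i_1} b_{i_2} \le a_{i_2} b_{i_1} \]

and

\[ a_{i_1} b_{j_2} \le a_{j_2} b_{i_1}, \]

it follows that

\[ (a_{i_2} + a_{j_2}) b_l - (b_{i_2} + b_{j_2}) a_l \le 0 \]

for any $l \ne i_1$. Plug in $l = i_2$ and $l = j_2$. Then we obtain two inequalities:

\[ a_{j_2} b_{i_2} - a_{i_2} b_{j_2} \le 0, \]

\[ a_{i_2} b_{j_2} - a_{j_2} b_{i_2} \le 0. \]

Thus $\frac{b_{i_2}}{a_{i_2}} = \frac{b_{j_2}}{a_{j_2}}$. But this is not possible due to the assumption of the proposition. So
$j_2 = i_1$ and the proposition is proven. □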
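For $n = 2$, exactness can be checked against exhaustive search over all pairs. The sketch below uses hypothetical function names and random positive integer arrays; it is meant as an illustration of the proposition, not as evidence beyond the sampled cases.

```python
import random
from fractions import Fraction
from itertools import combinations

def greedy_pair_ratio(a, b):
    """Ratio reached by the greedy method for n = 2."""
    i1 = min(range(len(a)), key=lambda k: Fraction(a[k], b[k]))
    return min(Fraction(a[i1] + a[k], b[i1] + b[k])
               for k in range(len(a)) if k != i1)

def best_pair_ratio(a, b):
    """Least ratio over all index pairs."""
    return min(Fraction(a[i] + a[j], b[i] + b[j])
               for i, j in combinations(range(len(a)), 2))

random.seed(1)
ok = True
for _ in range(500):
    a = [random.randint(1, 30) for _ in range(7)]
    b = [random.randint(1, 30) for _ in range(7)]
    ok = ok and greedy_pair_ratio(a, b) == best_pair_ratio(a, b)
print(ok)
```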

Theorem 3.3 puts a restriction on the position of the least fraction in the sense that
its indices must intersect with those constructed by the greedy method. It reduces
the number of possibilities that we would have to cover if we were to use the brute
force search procedure: from $\frac{N!}{n!(N-n)!}$ to $\frac{N!}{n!(N-n)!} - \frac{(N-n)!}{n!(N-2n)!}$.
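The size of this reduction can be computed directly: the excluded sets are exactly those drawn from the $N - n$ indices outside the greedy set. The helper name below is hypothetical.

```python
import math

def candidates_after_theorem(N, n):
    """Index sets left to inspect once sets disjoint from the greedy set are excluded."""
    total = math.comb(N, n)
    disjoint = math.comb(N - n, n)  # math.comb returns 0 when n > N - n
    return total - disjoint

print(candidates_after_theorem(4, 2), candidates_after_theorem(10, 3))  # 5 85
```

For $N = 10$ and $n = 3$ only 35 of the 120 candidates are eliminated, which shows how mild the restriction is.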

We conclude this work by noting that, although Theorem 3.3 provides only a seemingly weak
restriction on the location of the least fraction based on the greedy algorithm,
the greedy method does not, in general, provide the exact
solution to the minimization problem, i.e. $I_n \ne J_n$ in general, which partly excuses
Theorem 3.3 for its pessimistic estimate. A concise example of the inexactness of the
greedy method is below:

\[ a = \{1, 3, 6, 4\}, \quad b = \{10, 3, 12, 6\}. \]

Clearly, when $n = 3$, the greedy method chooses $i_1 = 1$, $i_2 = 2$, $i_3 = 3$. The minimal
fraction is, however, located at indices $j_1 = 1$, $j_2 = 3$, $j_3 = 4$, which is easy to verify.
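The verification can be carried out with a few lines of code. This is a minimal sketch with illustrative variable names; indices are 0-based internally and printed 1-based to match the text.

```python
from fractions import Fraction
from itertools import combinations

a = [1, 3, 6, 4]
b = [10, 3, 12, 6]
n = 3

def ratio(idx):
    """Exact ratio of the same-index sums for the given index collection."""
    return Fraction(sum(a[i] for i in idx), sum(b[i] for i in idx))

# Greedy selection: extend the current list by the best single index.
greedy = []
for _ in range(n):
    k = min((k for k in range(len(a)) if k not in greedy),
            key=lambda k: ratio(greedy + [k]))
    greedy.append(k)

# Exhaustive search over all index sets of size n.
best = min(combinations(range(len(a)), n), key=ratio)

print([i + 1 for i in greedy], ratio(greedy))  # [1, 2, 3] 2/5
print([i + 1 for i in best], ratio(best))      # [1, 3, 4] 11/28
```

The greedy fraction $2/5$ exceeds the minimal fraction $11/28$, while the two index sets still share indices 1 and 3.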

Future work on the subject may involve obtaining better estimates for the
position of the minimal fraction based on the result of the greedy method, or proving
that the estimate $I_n \cap J_n \neq \emptyset$ cannot in general be improved. Finally, obtaining
upper bounds for the difference between the greedily selected fraction and the actual
least fraction is another potential research direction.